CenTauR: Toward a universal scale and masks for standardizing tau imaging studies

Abstract INTRODUCTION Recently, an increasing number of tau tracers have become available. There is a need to standardize quantitative tau measures across tracers, supporting a universal scale. We developed several cortical tau masks and applied them to generate a tau imaging universal scale. METHOD One thousand forty‐five participants underwent tau scans with either 18F‐flortaucipir, 18F‐MK6240, 18F‐PI2620, 18F‐PM‐PBB3, 18F‐GTP1, or 18F‐RO948. The universal mask was generated from cognitively unimpaired amyloid beta (Aβ)− subjects and Alzheimer's disease (AD) patients with Aβ+. Four additional regional cortical masks were defined within the constraints of the universal mask. A universal scale, the CenTauRz, was constructed. RESULTS None of the regions known to display off‐target signal were included in the masks. The CenTauRz allows robust discrimination between low and high levels of tau deposits. DISCUSSION We constructed several tau‐specific cortical masks for the AD continuum and a universal standard scale designed to capture the location and degree of abnormality that can be applied across tracers and across centers. The masks are freely available at https://www.gaain.org/centaur‐project.

cognitive decline in AD. [4][5][6][7][8][9][10][11] In addition to the idiosyncratic characteristics of tau aggregates, and their asymmetric and heterogeneous brain distribution, a major obstacle to the widespread implementation of tau imaging in therapeutic trials or to comparing the findings of investigational imaging studies across cohorts and institutions is that tau tracers differ in their molecular structures and display a range of tau binding affinities, in vivo kinetics, and degree of non-specific binding, as well as distinct regional patterns of "off-target" and non-specific binding. Such differences lead to disparities in PET-derived standardized uptake value ratio (SUVR) measurements between tracers, as highlighted by several head-to-head studies comparing different tau tracers. 12,13 It is also important to note that most of these tau tracers do not reach apparent steady state in regions with high tau pathology during the scanning period, and while the use of semi-quantitative estimates such as SUVR was adopted early in the implementation of these tracers as a compromise to make PET imaging studies less burdensome to clinical populations, a priori kinetic modeling studies of tau tracers in early development stages may have led to further optimization of scanning protocols to be less biased to tau signal. [14][15][16][17] When added to the use of diverse quantitative approaches and different regions of interest, these methodological differences conspire to decrease reproducibility and pose a challenge when trying to compare tau outcomes across cohorts or in therapeutic trials that use different tau tracers. A further obstacle within the tau PET field is the definition of a reliable, consistent, and reproducible threshold of abnormality across tracers. One of the issues relates to the actual utility of a cut-off given the continuous nature of Aβ or tau deposition. 18,19
Although thresholds are arbitrary, adopting one requires showing that it is relevant and accurate from a diagnostic and/or prognostic point of view. 20,21 In essence, biomarker thresholds should be adopted for a specific purpose that is directly related to the clinical question under scrutiny. From a clinical perspective, a visual binary (positive/negative) status will help separate those subjects with a significant aggregated protein burden in the brain that is likely to explain the clinical syndrome from those with a low pathologic burden that is likely to be clinically insignificant. Similar dilemmas arise in research settings.
In response to similar challenges faced earlier with Aβ PET, 22 a standardization method was developed whereby Aβ PET outcome data acquired using different Aβ tracers and methods were normalized to a 100-point scale, the units of which were termed "Centiloids," using a linear scaling procedure. 22 Although the method transforms the semiquantitative results of all Aβ tracers into a single universal scale, sampling was based only on 11C-Pittsburgh compound B, so the idiosyncratic binding properties of the other Aβ tracers remain unaccounted for, and they might be more or less sensitive or accurate for making a statement about a similar index of cerebral Aβ burden. Furthermore, while the pattern of Aβ deposition throughout the brain is relatively uniform across subjects, and thus a single universal target mask provides reproducible statements of Aβ in the brain, the deposition of tau, especially at the early stages, tends to be more heterogeneous, 23 requiring a more regional approach to the sampling of target areas.
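The linear scaling behind the Centiloid scale can be illustrated with a minimal sketch. It assumes the usual two-anchor form, in which the mean SUVR of a young control group maps to 0 and the mean SUVR of a typical AD group maps to 100. The function name and the anchor values below are hypothetical placeholders chosen for illustration, not values taken from this study.

```python
def to_centiloid(suvr, anchor_zero, anchor_hundred):
    """Linearly rescale an SUVR so that anchor_zero -> 0 and anchor_hundred -> 100.

    anchor_zero and anchor_hundred are the group mean SUVRs used as anchors
    (hypothetical inputs here; real anchors depend on tracer and pipeline).
    """
    if anchor_hundred == anchor_zero:
        raise ValueError("anchors must differ")
    return 100.0 * (suvr - anchor_zero) / (anchor_hundred - anchor_zero)


# Illustrative anchors only (assumed, not from the paper).
example = to_centiloid(1.5, anchor_zero=1.0, anchor_hundred=2.0)
```

Because the mapping is purely linear, it rescales each tracer onto common units but does not account for differences in tracer-specific variance.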
In the present study, we aimed to standardize tau PET results by establishing the location and amount of abnormality of tau aggregates in the brain and expressing them on a universal standard scale, the units of which are termed "CenTauR," using tau PET data from the six most commonly used tracers (18F-flortaucipir, 18F-MK6240, 18F-PI2620, 18F-PM-PBB3, 18F-RO948, and 18F-GTP1) and an approach similar to the one used in the Centiloid project.

METHODS
This study involved 1045 participants from various cohorts (Aus- Method S1 in supporting information). All participants were assigned a diagnosis of cognitively unimpaired (CU), mild cognitive impairment (MCI), or AD dementia or other dementia (OD) by the entity providing the data. Criteria for assigning participant diagnosis can be found elsewhere. 15,[24][25][26][27] Aβ status (Aβ+ or Aβ−) was defined using either Aβ PET or the Aβ42/Aβ40 ratio in cerebrospinal fluid (CSF). Analysis of variance was used to determine any significant demographic difference between cohorts.

Image processing
Tau scans were spatially normalized using the principal component analysis (PCA)-based Computational Analysis of PET by AIBL (CapAIBL), 28 which is a publicly available cloud-based platform in which PET images are spatially normalized to a standard template using an adaptive atlas approach (https://capaibl-milxcloud.csiro.au), and Statistical Parametric Mapping (SPM, version 8) using the standard pipeline for the Centiloid method (CL-SPM) described in Klunk et al. 22 For more detailed information on the Centiloid pipeline, including MATLAB commands, please refer to Method S2 in the supporting information. All spatially normalized scans were visually assessed to ensure proper registration, especially in the mesial temporal lobe (MTL). 29 In the case of CL-SPM, all scans that did not pass visual assessment were reprocessed using a different orientation matrix until they passed a visual quality check (QC). Scans that failed visual QC three times in a row were excluded from further analysis. In the CU group, Aβ− scans were excluded if the presence of tau was visually detected in the cortex or in the MTL. As reference region, we defined a sub-cerebellar cortex region based on the Centiloid cerebellum cortex mask, excluding the upper portion (slice > −37) of the cerebellum to avoid off-target binding often observed in the cerebellar vermis, and also the lower part (slice < −47) to avoid quantification challenges such as partial volume, low axial sensitivity, and out-of-field scatter (Figure S1 in supporting information). The same reference region was used for the CL-SPM and CapAIBL pipelines.
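The trimming of the cerebellar reference region can be sketched as below. This is an illustration only: it assumes that the slice limits (−47 and −37) are axial z coordinates in millimetres in template space and that the image affine is axis-aligned, neither of which is stated explicitly in the text. The function name and arguments are hypothetical.

```python
import numpy as np


def restrict_reference_region(mask, affine, z_min=-47.0, z_max=-37.0):
    """Keep only mask voxels whose axial coordinate lies in [z_min, z_max].

    Assumes an axis-aligned 4x4 affine, so the world z coordinate depends
    only on the third voxel index (an assumption made for this sketch).
    """
    mask = np.asarray(mask, dtype=bool)
    k = np.arange(mask.shape[2])
    z = affine[2, 2] * k + affine[2, 3]  # world z of each axial slice
    keep = (z >= z_min) & (z <= z_max)
    return mask & keep[np.newaxis, np.newaxis, :]


# Toy example: 2 mm slices starting at z = -52 mm (values are illustrative).
toy_affine = np.diag([2.0, 2.0, 2.0, 1.0])
toy_affine[2, 3] = -52.0
toy_mask = np.ones((2, 2, 10), dtype=bool)
trimmed = restrict_reference_region(toy_mask, toy_affine)
```

In the toy example only the slices at z = −46 to −38 mm survive, which mirrors the exclusion of the upper and lower portions of the cerebellum described above.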
For each tracer and normalization approach (i.e., CapAIBL, CL-SPM), we averaged all CU Aβ− and AD Aβ+ scans separately, generating mean CU Aβ− and AD Aβ+ images. We then subtracted the CU Aβ− mean image from the AD Aβ+ mean image to generate a difference image.
After exploring several thresholds, the resultant difference image was thresholded at one third of the difference in the inferior temporal lobe. This threshold produced large volumes of interest that were consistent across tracers and covered the areas of the brain with the greatest tau load.
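The construction of a tracer-specific mask described above can be sketched as follows. The sketch assumes that all scans are already spatially normalized SUVR volumes of identical shape, and it interprets "one third of the difference in the inferior temporal lobe" as one third of the mean difference within an inferior temporal region of interest; that interpretation, and all names below, are assumptions.

```python
import numpy as np


def tracer_specific_mask(cu_neg_scans, ad_pos_scans, inferior_temporal, fraction=1.0 / 3.0):
    """Return (difference image, thresholded mask) for one tracer.

    cu_neg_scans, ad_pos_scans: arrays of shape (n_scans, x, y, z).
    inferior_temporal: boolean array of shape (x, y, z) marking the ROI
    used to set the threshold (hypothetical input for this sketch).
    """
    cu_mean = np.mean(cu_neg_scans, axis=0)   # mean CU Abeta- image
    ad_mean = np.mean(ad_pos_scans, axis=0)   # mean AD Abeta+ image
    diff = ad_mean - cu_mean                  # difference image
    cutoff = fraction * diff[inferior_temporal].mean()
    return diff, diff > cutoff


# Toy data: two scans per group on a 2x2x2 grid (illustrative values only).
cu = np.ones((2, 2, 2, 2))
ad = np.full((2, 2, 2, 2), 1.2)
ad[:, 0, 0, 0] = 4.0          # "inferior temporal" voxel, difference 3.0
ad[:, 1, 1, 1] = 2.5          # another high-tau voxel, difference 1.5
roi = np.zeros((2, 2, 2), dtype=bool)
roi[0, 0, 0] = True
diff_img, mask = tracer_specific_mask(cu, ad, roi)
```

With these toy values the cutoff is 1.0, so only the two voxels with a large AD-minus-CU difference are retained.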
We then constructed a "universal" tau mask from the intersection (i.e., spatial overlap) of the six tracer-specific masks. An MRI-derived gray matter mask obtained from the FreeSurfer segmentation of 100 MRIs (independent dataset) at PET resolution was then applied to the uni-
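The intersection step can be sketched as a voxel-wise logical AND across the tracer-specific masks. The sketch also restricts the overlap to a gray matter mask, which is how we read the truncated sentence above; that reading, and the function and variable names, are assumptions.

```python
import numpy as np


def universal_mask(tracer_masks, gray_matter):
    """Voxels present in every tracer-specific mask and in gray matter.

    tracer_masks: sequence of boolean arrays with identical shape.
    gray_matter: boolean array of the same shape (assumed already
    resampled to the PET grid).
    """
    stacked = np.stack([np.asarray(m, dtype=bool) for m in tracer_masks])
    overlap = np.all(stacked, axis=0)          # spatial intersection
    return overlap & np.asarray(gray_matter, dtype=bool)


# Toy example with three one-dimensional "masks" (illustrative only).
masks = [
    np.array([1, 1, 1, 0], dtype=bool),
    np.array([1, 1, 0, 0], dtype=bool),
    np.array([1, 1, 1, 1], dtype=bool),
]
gm = np.array([0, 1, 1, 1], dtype=bool)
result = universal_mask(masks, gm)
```

An intersection is conservative: a voxel is kept only if every tracer shows an above-threshold difference there.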

Visual topographical subtype classification
Seventy-eight 18F-MK6240 AD Aβ+ scans from the AIBL cohort were visually rated by two readers (C.C.R. and N.K.), blind to participant characteristics, resulting in consensus visual reads, as previously described. 31 Briefly, scans were rated as (1)

RESULTS
Participant characteristics by tau PET tracer are summarized in Table S1 in supporting information. Overall, participants from the 18

Tau mask sampling
Twenty-three scans (eight 18F-RO948, one 18F-GTP1, five 18F-PI2620, one 18F-FTP, eight 18F-PM-PBB3) did not pass visual QC using the CL-SPM pipeline or did not have an MRI of sufficient quality, whereas only one scan did not pass visual QC using both CapAIBL and CL-SPM. A further six CU Aβ− scans were excluded on visual assessment because of tracer uptake in the MTL. These 29 scans were excluded from further analysis.
CL-SPM tracer-specific masks showed a reasonable overlap (Figure S2 in supporting information), with a global Dice score of 0.58 (95% confidence interval [CI], 0.52-0.61) and a Dice score in the cortical mask of 0.61 (95% CI, 0.60-0.69). The mean Dice score obtained when comparing paired tracer-specific masks was 0.85 (Table S2 in supporting information). All masks included the mesial temporal, metatemporal, posterior cingulate/precuneus, and subfrontal regions. The CenTauR mask overlaid on an MRI template is shown in Figure 1, while the subregion masks are shown in Figure S3 in supporting information.
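For readers unfamiliar with the overlap metric, the Dice score between two binary masks A and B is 2|A∩B| / (|A| + |B|). The sketch below computes it for a pair of masks and averages it over all pairs; it is a generic illustration, not the code used in the study, and the convention of returning 1.0 for two empty masks is an assumption.

```python
from itertools import combinations

import numpy as np


def dice(a, b):
    """Dice overlap between two boolean masks of identical shape."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks overlap fully
    return 2.0 * np.logical_and(a, b).sum() / total


def mean_pairwise_dice(masks):
    """Average Dice score over every unordered pair of masks."""
    scores = [dice(a, b) for a, b in combinations(masks, 2)]
    return float(np.mean(scores))


# Toy masks (illustrative only).
m1 = np.array([1, 1, 0, 0], dtype=bool)
m2 = np.array([1, 0, 0, 0], dtype=bool)
m3 = np.array([1, 1, 0, 0], dtype=bool)
pair_score = dice(m1, m2)
mean_score = mean_pairwise_dice([m1, m2, m3])
```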
None of the known off-target signal regions were discernible in the five masks (Figure S4 in supporting information).
Both quantitative pipelines provided very similar tau masks, with a Dice score of 0.75 between universal masks generated using CapAIBL and CL-SPM. Part of this difference was due to the normalized space of CapAIBL, which is different from the Montreal Neurological Institute space, so the CL-SPM mask required resampling to be compared to the CapAIBL mask. In the remainder of this paper, we only use the masks defined using the CL-SPM pipeline.

CapAIBL versus CL-SPM pipeline
The equations to convert CapAIBL SUVR values into CTRz scores are presented in Table S3

DISCUSSION
In the present work we described the CenTauRz scale, a method that facilitates the expression of the level of abnormality of the semiquantitative tau PET signal at both a regional and global level. By incorporating the intrinsic "noise" of each tau tracer into the measurement, the CenTauRz scale also allows the generation of a universal scale of tau pathologic burden across tracers. The two pipelines used to quantify brain PET imaging (CapAIBL and CL-SPM) generated consistent results in quantifying tau scans in all ROIs, with high discriminative power in distinguishing AD Aβ+ from CU Aβ− and tau-negative scans from limbic predominant, hippocampal sparing, and typical AD tau scans when using a threshold of > 2 CTRz in different ROIs.
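The idea of a z-type scale with a threshold of 2 can be illustrated with a short sketch. The exact CenTauRz definition is not given in this section, so the code assumes a conventional z-score: a regional SUVR is centered on the mean of the CU Aβ− group for the same tracer and region and divided by that group's standard deviation (the tracer-specific "noise"). The function names and reference values are hypothetical.

```python
import numpy as np


def z_scale(suvr, cu_neg_reference):
    """Z-score of an SUVR relative to a CU Abeta- reference sample.

    Assumed definition for illustration: (SUVR - mean) / sample SD, with
    mean and SD taken from the same tracer and the same region.
    """
    ref = np.asarray(cu_neg_reference, dtype=float)
    return (np.asarray(suvr, dtype=float) - ref.mean()) / ref.std(ddof=1)


def is_high_tau(z, threshold=2.0):
    """Apply the > 2 threshold mentioned in the text."""
    return z > threshold


# Hypothetical reference SUVRs for one tracer and region.
reference = [1.0, 1.1, 0.9, 1.0]
z_high = z_scale(1.5, reference)
z_low = z_scale(1.05, reference)
```

Scaling by the reference standard deviation is what lets a tracer with a narrow dynamic range but low variance be compared with one that has a wide range and high variance.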
An important aspect, both for clinical interpretation and for therapeutic trials, is the selection of brain regions sampled to capture the distribution of tau, how this index of tau load changes over time, and what CTRz level is considered high tau. 33 Given the low spatial resolution of PET, it can be counterproductive to impose a piecemeal neuropathological staging system, such as those proposed by Braak and Braak 34 or Delacourte, 35 on the sampling of tau PET images. 36,37 Atypical and heterogeneous presentations of tau deposits, and how they intimately relate to the clinical phenotype, 34,35 are missed by the incrementally sequential Braak staging. Applying the Braak or Delacourte staging 34,35 is further complicated by the different neuropathological subtypes of tau deposition in AD. 38 Of the pathological AD subtypes, only the typical one (reported to be between 55% and 75% in different series) [39][40][41] completely fulfills the sequential Braak stages.
Several reports have shown that a meta-temporal region, 42 or a temporoparietal (including posterior cingulate) AD-signature region 43,44 outperforms the Braak staging for the early detection of cortical tau, for establishing the differential diagnosis of AD versus non-AD neurodegenerative conditions, 45 as well as for capturing longitudinal changes in cortical tau signal. These regions seem to perform reliably across different tau tracers and use sites and, despite these tracers presenting different dynamic ranges, they yielded the same cut-off for abnormality in different cohorts. 46 While the use of tau imaging for disease staging is strongly recommended, 47 the use of neuropathological staging should be applied carefully, not as an a priori condition, but as the result of the actual observed pattern of tau deposition on the PET images. Furthermore, it has been shown that tau imaging, at least with 18F-FTP, 48 can reliably detect a B3 stage (equivalent to Braak V-VI), so attempting to classify earlier Braak stages using this tracer, with its high level of non-specific binding, 49 would likely yield less reliable results.
Similar issues may apply to other tau tracers. Such considerations argue against using current neuropathological staging approaches, especially because they progress from very small regions (Braak I-II) that are susceptible to partial volume effects and easily contaminated by off-target binding, to very large regions (Braak V-VI) that encompass large portions of the cerebral cortex and subcortical structures, making them impractical for implementation in clinical studies, and foremost, in therapeutic trials. Our method is designed to capture tau levels and distribution in the brain as well as tau progression and most of the reported heterogeneities in tau PET studies, such as primary age-related tauopathy (PART) and proposed subtypes and heterogeneity in the patterns of tau distribution. 31,50 Similar methods can be used to select a brain region as reference to scale the tissue ratios. Attempts to define a universal cerebellar tau mask are already underway, 51 but will require testing with all tau tracers to assess whether it improves the CTRz accuracy. Figure S13]); and (4) it provides a comprehensive scheme to facilitate and standardize head-to-head comparisons between tau tracers. 53,54 Moreover, and in contrast to the Centiloid approach, by incorporating the tracer-specific "noise" into the measurement, the CenTauRz approach provides a more robust and meaningful underpinning for head-to-head comparisons between these tracers. Last, the modular approach also allows the examination of certain brain regions separately given that they behave differently over time, with for example the MTL accumulating tau early but also plateauing early, or the temporoparietal region, which seems to be the most sensitive region to capture tau accumulation in the brain, and is likely large enough to provide robust statements of changes in tau burden in a clinical trial. 55,56
In conclusion, we constructed several universal tau PET-specific cortical masks for the AD continuum based on all the commonly used tau tracers, and a universal standard scale, the CenTauRz, designed to capture the location and degree of abnormality of tau pathology that can be applied across tracers and across centers. While the CenTauR scheme does not answer all questions about measuring tau deposits, it establishes a robust and reproducible standard framework on which to build, and which can be implemented in the clinic and applied in therapeutic trials.

ACKNOWLEDGMENTS
The research was supported by the Australian federal government through NHMRC grants APP1132604, APP1140853, and APP1152623 and by a grant from Enigma Australia.

CONSENT STATEMENT
All participants gave written consent for publication of de-identified data.

COLLABORATORS
The